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EXECUTIVE  SUMMARY 


This  report  describes  the  results  of  the  project  “More  reliable  wireless  communications  through  polar 
codes,”  funded  in  fiscal  year  2014  by  the  Naval  Innovative  Science  and  Engineering  (NISE)  program  at 
SSC  Pacific. 

OBJECTIVE 

The  purpose  of  the  project  is  to  determine  if  polar  codes  can  outperform  the  forward  error  correc¬ 
tion  currently  used  in  Navy  wireless  communication  systems.  This  report  explains  polar  codes  to  non¬ 
specialists. 

RESULTS 

The  project  team  has  written  and  tested  software  that  implements  several  published  polar  coding  al¬ 
gorithms.  For  comparison,  we  have  also  implemented  and  tested  other  forward  error  correction  methods: 
a  turbo  code,  a  low  density  parity  check  (LDPC)  code,  a  Reed-Solomon  code,  and  three  convolutional 
codes. 
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1.  INTRODUCTION 


Forward  error  correction  (FEC)  is  a  method  for  improving  the  reliability  of  digital  communications. 
This  is  especially  important  for  wireless  communications,  which  arc  subject  to  noise,  fading,  and  interfer¬ 
ence.  It  is  common  that  a  transmitter  will  send  a  0,  but  the  receiver  receives  a  1,  or  vice  versa.  FEC  adds 
redundancy  to  digital  messages  before  transmission.  The  goal  of  FEC  is  that  even  if  some  bits  are  flipped, 
the  receiver  can  use  the  redundant  information  to  reconstruct  the  original  message. 

The  U.S.  Navy  uses  FEC  in  most  of  its  wireless  communications.  Navy  systems  use  several  types 
of  FEC,  with  turbo  codes  being  the  most  common.  Many  civilian  systems  use  low  density  parity  check 
(LDPC)  FEC  codes,  and  the  Navy  is  planning  to  use  LDPC  for  some  future  systems. 

Polar  codes  arc  a  new  form  of  FEC.  The  first  journal  article  about  polar  codes  was  [1]  by  E.  Arikan, 
published  in  2009.  This  paper  made  a  great  impact  in  the  academic  community.  Polar  codes  arc  of  interest 
primarily  for  theoretical  reasons,  but  they  may  also  have  practical  use  as  a  replacement  for  turbo  or  LDPC 
codes. 

The  author  of  this  report  and  his  project  team  have  studied  polar  codes  to  determine  if  they  can  im¬ 
prove  the  reliability  and  throughput  of  Navy  wireless  communications.  This  research  was  funded  in  fis¬ 
cal  year  2014  by  the  Naval  Innovative  Science  and  Engineering  (NISE)  program  at  SSC  Pacific.  We  have 
found  the  polar  coding  literature  hard  to  read.  The  main  purpose  of  this  report  is  to  make  it  easier  for  oth¬ 
ers  to  learn  what  we  have  learned.  In  addition,  many  polar  coding  papers  describe  complex  algorithms  but 
do  not  provide  source  code.  We  have  written  our  own  source  code  to  implement  these  algorithms  [2],  and 
this  report  should  help  guide  the  use  of  our  code.  Distribution  of  the  source  code  is  authorized  only  to  U.S. 
Government  agencies  and  their  contractors. 

The  author  has  done  his  best  to  make  this  report  readable  by  non-specialists.  To  understand  this  report, 
the  reader  should  be  comfortable  with  the  basic  concepts  of  probability,  including  conditional  probability. 
The  reader  should  also  know  a  few  common  mathematical  symbols  for  sets  and  functions. 
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2.  BLOCK  CODES 


Let  N  and  K  be  positive  integers  with  K  <  N.  An  (N.  K)  block  code  is  a  function  from  {0,  1 }  K 
to  {0, 1}^,  i.e.,  its  input  is  a  vector  of  K  bits  and  its  output  is  a  vector  of  N  bits.  An  encoder  is  an  im¬ 
plementation  of  the  function  in  software  or  hardware. 1  N  is  called  the  block  size  or  block  length.  K/N  is 
called  the  code  rate. 


We  usually  use  the  symbol  u  for  the  code’s  input,  and  x  for  its  output.  The  bits  of  u  are  called  infor¬ 
mation  bits,  because  they  are  the  information  that  the  sender  wants  the  receiver  to  receive.  The  bits  of 
x  are  called  code  bits.  We  send  x  through  a  channel,  and  the  channel  output  is  called  y.  A  decoder  is  a 
system  designed  to  undo  the  effects  of  a  particular  encoder  and  channel.  It  takes  y  as  input,  and  outputs 
u  e  {0, 1}A,  with  the  goal  that  u  =  u.  The  block  error  rate  (BLER)  is  the  probability  that  u  f  u.  i.e., 
the  probability  that  the  block  is  not  decoded  correctly.2  The  bit  error  rate  (BER)  is  the  average  probability 
that  a  bit  is  not  decoded  correctly.  (We  must  say  “average”  because  different  positions  with  u  may  have 
different  error  probabilities.) 

Example:  Repetition  code  of  rate  1/3.  Let  K  =  1  and  N  =  3,  and  let  the  encoder  map  o  HA  (0,0,0) 
and  1  i-a  (1, 1, 1).  Suppose  the  channel  maps  {0,  l}3  to  {0,  l}3  so  that  each  bit  independently  has  a  10% 
chance  of  being  flipped.  Then  the  decoder  should  use  the  majority  vote  rule:  if  y  is  (0, 0, 0),  (0, 0, 1), 

(0, 1, 0),  or  (1, 0,  0),  then  u  =  0;  if  y  is  (1, 1, 1),  (0, 1, 1),  (1,  0, 1),  or  (0, 1, 1),  then  u  =  1.  The  BLER 
and  BER  are  both  0.028. 


Example:  (7,  4)  Hamming  code  ([3]).  Let  K  =  4  and  N  =  7.  For  any  u 
the  encoder  maps  u  to 

[1  1  1  0  0  0  O' 

10  0  110  0 

T  =  11 

0  10  10  10 
110  10  0  1 


(ui,U2,U3,U4)  e  {0,  l}4, 


with  the  result  computed  modulo  2.  This  encoder  maps  the  16  vectors  in  {0,  l}4  to  16  different  vectors 
in  {0,  l}7  called  codewords.  These  16  codewords  were  chosen  to  satisfy  the  following  property:  each  of 
the  128  vectors  in  {0,  l}7  is  either  a  codeword  or  one  bit  different  from  a  codeword.  Suppose  the  chan¬ 
nel  maps  {0,  l}7  to  {0,  l}7  so  that  each  bit  independently  has  a  5%  chance  of  being  flipped.  The  decoder 
should  find  the  codeword  that  is  either  equal  to  or  one  bit  different  from  the  received  vector,  and  then  com¬ 
pute  the  vector  in  {0,  l}4  that  maps  to  that  codeword.  So  the  decoder  will  be  correct  if  at  most  one  bit  gets 
flipped.  Therefore  the  BLER  is  1  —  (0.95)7  —  T (0.05) (0.95)6  «  0.044.  The  calculation  of  the  BER  is  more 
complicated  and  will  be  omitted. 


2.1  LATENCY 

Latency,  also  called  delay,  is  the  time  it  takes  for  a  particular  information  bit  to  travel  from  the  sender 
to  the  receiver.  The  use  of  a  block  code  adds  to  latency;  we  wish  to  determine  how  much.  We  make  the 
following  assumptions: 

1.  The  information  bits  arrive  at  the  encoder  at  a  uniform  rate  p,  which  is  called  the  data  rate  and  is 
measured  in  bits  per  second  (bps).  The  bits  are  stored  in  the  encoder’s  input  buffer  until  K  of  them 
have  arrived.  The  encoder  then  unloads  the  buffer  and  processes  the  bits. 

'Coding  theorists  often  call  this  function  an  encoder,  and  define  code  as  the  image  of  the  function.  This  report  does  not  use 
these  definitions. 

2Some  authors  use  the  term  frame  error  rate. 
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2.  After  the  encoder  has  processed  the  bits  for  time  T, .  the  N  output  bits  arc  available  in  the  encoder’s 
output  buffer.  They  leave  the  output  buffer  at  a  uniform  rate  jip. 

3.  The  decoder  does  the  same  in  reverse:  the  iV-bit  input  buffer  is  filled  at  rate  p,  the  decoder  pro¬ 
cesses  the  block  in  time  7)/,  and  the  A' -bit  output  buffer  is  emptied  at  rate  p. 

4.  The  encoder  can  only  process  one  block  at  a  time.  It  must  be  fast  enough  that  when  the  last  bit  of 
block  n  arrives,  it  must  be  done  encoding  block  n  —  1  so  that  it  can  immediately  start  encoding  block 
n;  otherwise  some  of  the  bits  of  block  n  will  be  overwritten  by  bits  of  block  n  +  1.  The  same  is 
true  of  the  decoder.  This  assumption  puts  a  limit  on  the  data  rate  at  which  a  particular  encoder  and 
decoder  may  be  used. 

The  time  to  till  either  input  buffer  or  empty  either  output  buffer  is  K / p.  If  block  n  begins  arriving  at  the 
encoder  at  time  t,  then  it  finishes  arriving  at  time  t  +  K / p,  and  begins  leaving  at  time  t  +  K/ p  +  Te.  So  the 
latency  added  by  the  encoder  is  K / p  +  Te.  Similarly,  the  latency  added  by  the  decoder  is  K/ p  +  Trj.  By  the 
last  assumption,  T,j  and  Te  are  both  <  A/p,  so  the  total  latency  added  is  between  2K / p  and  4 A / p. 

For  example,  suppose  A"  =  1024,  Te  =  1 5 s,  and  7)/  =  10~4  s.  If  the  data  rate  is  150  bps,  then 
the  coding  adds  13.7  s  of  latency,  which  is  unacceptable  for  many  purposes.  In  contrast,  if  the  data  rate 
is  1  megabit  per  second  (Mbps),  then  the  coding  adds  2.16  milliseconds  (ms)  of  latency,  which  might  be 
insignificant  compared  to  other  delays  in  the  system.  (For  example,  if  the  signal  is  relayed  by  a  geosyn¬ 
chronous  satellite,  the  propagation  delay  is  about  250  ms.)  The  maximum  possible  data  rate  for  this  de¬ 
coder  is  K / Tfi  =  10.24  Mbps. 
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3.  INFORMATION  THEORY 


In  [4],  C.E.  Shannon  started  the  field  of  information  theory,  which  is  largely  concerned  with  communi¬ 
cation  through  channels.  Figure  1  shows  a  simple  model  of  a  radio  frequency  (RF)  communication  link. 

From  the  perspective  of  an  RF  engineer,  the  channel  is  the  gap  between  the  transmitter  and  the  re¬ 
ceiver,  as  shown  in  Figure  2.  The  input  and  output  are  both  RF  signals. 

From  the  perspective  of  information  theory,  the  channel  is  the  gap  between  the  encoder  and  the  de¬ 
coder,  as  shown  in  Figure  3.  The  input  is  a  bit,  and  the  output  depends  on  the  design  of  the  demodulator. 

An  information  theoretic  channel  is  specified  by  a  set  X  of  input  symbols,  a  set  y  of  output  symbols, 
and  for  each  x  G  X  and  y  £  y,  a  nonnegative  real  number  called  transition  probability  from  x  to  y. 

If  the  channel  is  called  W,  then  the  transition  probability  from  x  to  y  is  written  as  W (y\x).  X  is  usually 
{0, 1},  while  y  may  be  discrete  or  continuous.  If  x  is  input  to  the  channel,  W (y\x)  is  understood  as  the 
probability  of  the  output  being  y.  For  every  x  E  X,  it  is  required  that  Ylyey  W (u\x)  =  1  if  JV  is  discrete, 
or  that  fy  W (y\x)  dy  =  1  if  y  is  continuous. 

Frequently,  we  know  y  and  want  to  determine  x.  In  this  situation,  W (y\x)  is  also  called  the  likelihood 
of  x  given  y. 


Input  bits 


Output  bits 


Figure  1.  Simple  RF  link  model. 

Example  1:  Gaussian  Channel.  Suppose  we  are  using  binary  phase  shift  keying  (BPSK)  modulation 
and  transmitting  over  a  line-of-sight  RF  channel  with  negligible  multipath.  The  Gaussian  channel  is  used 
to  model  this  situation.  The  BPSK  modulator  first  maps  a  bit  to  a  point  in  the  in-phase/quadrature  (I/Q) 
plane,  0  i— >  (1,0), 1  K >  (—1,0).  Henceforth,  we  ignore  the  Q  component  because  it  does  not  affect  the 
demodulator’s  output.  The  I/Q  point  is  then  converted  to  a  sinusoidal  waveform  which  is  transmitted.  The 
receiver  rescales  the  waveform  to  compensate  for  the  attenuation.  The  demodulator  maps  the  waveform 
back  to  I/Q  space,  and  outputs  the  result.  The  overall  effect  is  that  0  i— >  1  +  n  and  1  i — >-  — 1  +  n,  where  n  is 
a  random  noise  variable  that  is  assumed  to  be  normally  distributed  with  0  mean. 

For  any  a  >  0,  the  Gaussian  channel  with  noise  variance  o1  has  output  set  y  =  R,  and  transition 
probabilities  W (y|0)  =  yyjyz  exp (— )  (a  normal  distribution  with  mean  1  and  standard  deviation 

a),  W (y\  1)  =  J—  exj)(— )  (a  normal  distribution  with  mean  —1  and  standard  deviation  a).  This 
channel  is  also  called  an  additive  white  Gaussian  noise  (AWGN)  channel. 
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Input  bits 


Encoder 


Transmitter 


Output  bits 


Figure  2.  RF  channel. 


Input  bits 


Output  bits 


Figure  3.  Information  theory  channel. 


Example  2:  Binary  Symmetric  Channel.  In  the  previous  example,  the  demodulator  output  a  point  in 
I/Q  space.  Suppose  instead  that  the  demodulator  must  decide  if  it  has  received  a  0  or  1.  It  decides  0  if  I  is 
positive,  and  1  if  I  is  negative.  The  probability  of  error  is  f0^  exp(—  )  dy. 

For  any  p  E  [0, 1],  the  binary  symmetric  channel  (BSC)  with  error  probability  p  has  output  set  y  = 
{0, 1},  and  transition  probabilities  W (0|0)  =  FF(1|1)  =  1  —  p,  TC(0|1)  =  W (1 1 0)  =  p. 

The  BSC  is  called  a  hard  decision  channel  because  it  outputs  a  0  or  1  and  gives  no  other  information. 
The  AWGN  is  called  a  soft  decision  channel  because  it  indicates  the  degree  of  uncertainty  in  estimating  the 
channel  input.  This  provides  more  information  for  the  decoder  to  estimate  the  encoder  input. 

All  channels  considered  in  this  report  will  be  binary  memoryless  symmetric  (BMS)  channels  unless 
stated  otherwise.  Binary  means  that  the  input  set  is  A  =  {0, 1}.  Memoryless  means  that  the  when  the 
channel  is  used  multiple  times,  the  transition  probabilities  of  each  use  are  independent  of  the  inputs  and 
outputs  of  other  uses.  A  channel  W  is  symmetric  if  there  is  a  one-to-one  correspondence  c  :  y  — >  y  such 
that  for  all  y  E  T, 
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L  c(c(y))  =  y; 

2.  W(y\0)  =  W(c(y)\l)- 

3.  W(y\l)  =  W(c(y)\0). 

c(y)  is  usually  written  y.  For  the  BSC,  0  =  1  and  1  =  0.  For  the  AWGN  channel,  y  =  —y. 

3.1  CHANNEL  CAPACITY 

Shannon  showed  how  to  compute  the  capacity  C(W)  of  a  channel  W,  which  is  the  highest  rate  at 
which  information  can  be  sent  through  the  channel  with  arbitrarily  low  BER.  Formally,  he  proved  that  for 
any  rational  number  r  <  C(W),  and  every  e  >  0,  there  is  an  code  with  rate  r  and  a  corresponding  decoder 
that  can  decode  the  channel  output  with  BER  <  e,  but  this  is  not  possible  for  any  r  >  C{W).  Shannon’s 
proof  did  not  show  how  to  explicitly  construct  such  a  code.  Furthermore,  the  decoder  in  his  proof  is  a  max¬ 
imum  likelihood  decoder.  For  any  u  e  {0, 1}  K ,  the  likelihood  of  u  is  the  probability  that  this  u  would 
produce  the  given  y.  The  maximum  likelihood  decoder  computes  the  likelihood  of  all  2K  possible  u’s,  and 
outputs  the  one  with  the  maximum  likelihood.  This  requires  0(2K )  steps,  which  is  not  practical  except  for 
very  small  values  of  K. 

Figure  4  shows  the  capacity  of  a  BSC  as  a  function  of  the  error  probability  p.  Figure  5  shows  the  ca¬ 
pacity  of  a  AWGN  channel  as  a  function  of  the  noise  standard  deviation  a. 


Figure  4.  Capacity  of  BSC. 


7 


4.  INTRODUCTION  TO  POLAR  CODES 

Polar  codes  were  introduced  by  E.  Arikan  in  [1].  This  paper  showed  how  to  construct  polar  encoders 
and  decoders  for  any  block  length  N  that  is  a  power  of  2,  and  any  K  <  N.  These  encoders  and  decoders 
are  relatively  efficient,  requiring  0{N  log  N)  instructions,  as  opposed  to  0( 2K )  for  a  maximum  likeli¬ 
hood  decoder.  They  were  the  first  efficient  encoders  and  decoders  proven  to  achieve  the  capacity  of  any 
BMS  channel.  This  is  not  as  useful  as  it  sounds.  Achieving  the  capacity  of  a  channel  W  does  not  mean  ex¬ 
hibiting  a  code  with  rate  C(W).  It  means  exhibiting  an  infinite  sequence  of  codes  such  that  as  N  goes  to 
infinity,  the  code  rate  converges  to  C(W)  and  BER  converges  to  0.  So  to  produce  a  polar  code  with  a  rate 
very  close  to  the  channel  capacity,  we  may  need  a  block  length  that  is  too  large  to  use. 

Reference  [5]  states  that  polar  codes  by  themselves  have  underperformed  state-of-the-art  codes  of  sim¬ 
ilar  block  lengths  and  code  rates.  This  may  not  be  the  most  relevant  comparison  because  the  efficiency 
of  polar  codes  may  make  them  feasible  at  larger  block  lengths  than  other  codes;  on  the  other  hand,  larger 
block  lengths  may  cause  unacceptable  latency. 

Several  authors  have  achieved  better  results  by  combining  polar  codes  with  other  codes,  for  example, 
[5-7].  In  particular,  I.  Tal  and  A.  Vardy  [5]  combined  a  polar  code  with  a  cyclic  redundancy  check  (CRC) 
and  found  that  the  resulting  code  of  length  2048  outperformed  the  length  2304  LDPC  code  used  in  the 
WiMax  standard.  In  [8,  9],  K.  Nui,  K.  Chen,  and  J.R.  Lin  combined  polar  codes  of  length  512,  1024,  and 
2048  with  a  CRC.  They  also  combined  this  CRC  with  turbo  codes  of  the  same  lengths,  and  found  that  the 
polar  codes  outperformed  the  turbo  codes.  However,  [8,  9]  compared  BLERs  and  did  not  give  any  BER 
results. 

Polar  codes  can  also  be  used  for  source  coding  (i.e.,  compression),  but  this  use  will  not  be  discussed  in 
this  report. 
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5.  POLAR  ENCODER 


An  (N,  K )  polar  code  is  a  block  code  with  K  inputs  and  N  outputs.  However,  we  will  begin  by  in¬ 
troducing  a  function  Gat  that  has  N  bit  inputs,  which  are  numbered  from  1  to  N,  and  N  bit  outputs,  also 
numbered  from  1  to  N.  The  input  to  Gat  is  a  row  vector  called  u  =  (u±,U2,  •  •  ■ ,  un),  where  Ui  is  the  bit 
that  goes  into  input  i?  This  disagrees  with  the  notation  in  Section  2,  where  we  stated  that  u  is  a  vector  of 
length  K.  The  output  of  Gat  is  a  row  vector  of  length  N  called  x  =  (x\,X2,  ■  ■  ■ ,  £jv). 

To  construct  a  polar  encoder,  we  choose  K  of  Gat’s  inputs,  and  put  our  information  bits  into  these  in¬ 
puts,  while  the  remaining  N  —  K  inputs  are  frozen,  i.e.,  held  constant.  The  frozen  bits  are  normally  set  to 
0,  but  they  may  have  any  value  that  is  known  to  both  the  encoder  and  the  decoder.  Figure  6  shows  an  (8, 4) 
polar  encoder  in  which  the  frozen  bits  are  u\,  U2,  us,  and  115.  The  set  {4, 6,  7, 8}  is  denoted  A  and  is  called 
the  information  set,  because  the  information  bits  are  sent  to  inputs  4,  6,  7,  and  8  of  Gg.  The  complement  of 
A  is  Ac  =  {1,  2,  3,  5},  which  is  called  the  frozen  set  because  inputs  1,  2,  3,  and  5  of  Gs  are  frozen.  The 
input  to  the  encoder  is  called  u(A).  The  output  from  the  encoder  is  the  same  as  the  output  from  Gn,  and  is 
called  x. 


5.1  CHOOSING  A 

Let  n  be  any  positive  integer  and  let  N  =  2n.  For  any  K  <  N,  we  can  construct  an  (N.  K)  polar 
code  by  choosing  A  to  be  any  A'-element  subset  of  {1,  2, ... ,  N }.  However,  we  must  choose  A  carefully 
to  get  a  good  code.  The  method  for  choosing  A  is  that  we  imagine  decoding  all  N  inputs  of  Gn  with  no 
frozen  bits,  and  determine  the  probability  of  decoding  error  for  each  input.  These  probabilities  depend  on 

3For  some  purposes  it  is  more  convenient  to  used  indices  0  through  N  —  1,  and  many  polar  coding  papers  use  this  convention. 
We  start  at  1  to  agree  with  MATLAB®  indexing. 
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the  channel  W.  We  optimize  the  polar  code  for  W  by  choosing  A  as  the  set  of  inputs  with  the  lowest  error 
probabilities. 


5.2  STRUCTURE  OF  GN 

Gn  is  defined  by  a  simple  recursive  rule.  Figures  7  and  8  show  diagrams  of  6' 2,  64,  and  Gs-  In  these 
figures,  ®  represents  the  modulo  two  sum,  or  equivalently,  the  exclusive  or  operator. 


Figure  7.  G2  and  G4. 


Figure  8.  Gs. 

In  general,  G2n  is  constructed  using  N  copies  of  G 2  and  two  copies  of  Gn,  as  shown  in  Figure  9. 
The  box  marked  R2n  is  a  permutation  called  the  reverse  shuffle,  which  copies  its  odd-numbered  inputs  to 
its  first  N  outputs,  and  copies  its  even-numbered  inputs  to  its  last  N  outputs.  So  the  recursive  formula  for 

G2n(u  1,  .  .  .  ,  U2n)  is  (Gn(ui  ©  U2,  U3  ©  U4,  .  .  .  ,  U2N-1  ©  U2n),  Gn(,U2,U4,  •  •  •  ,  U2N ))■ 

Gn  is  implemented  in  the  MATLAB®  file  WJSJ.m. 
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Figure  9.  Recursive  construction  of  G2N. 
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6.  POLAR  DECODER 


The  decoder  described  in  [1]  is  called  a  successive  cancellation  (SC)  decoder.  It  makes  bit  decisions 
one  at  a  time.  When  decoding  the  zth  bit,  the  data  it  uses  are  the  channel  output  y,  and  the  previous  bit 
decisions  u\  through  Ui-i,  which  it  assumes  are  correct.  It  uses  the  following  rules: 

•  If  the  ith  input  is  frozen,  then  Ui  is  known,  so  set  U{  =  itj. 

•  If  the  ith  input  is  not  frozen,  then  choose  iii  to  be  the  bit  value  that  would  have  a  higher  probability 
to  produce  the  channel  output  y.  This  probability  is  denoted  \yi,  ■  ■  . ,  yN,  ui,  ■  ■  ■ ,  Ui-i\ui). 

To  compute  . . . ,  yN,  u±,...,  Ui-\\ui  =  0),  choose  bit  values  iii+ 1  through  un,  and  let 

x  =  Gn(ui,  ■  ■  ■ ,  Ui- 1,0,  )-i, . . . ,  un ), 

i.e.,  the  encoder  output  when  (in, . . . ,  i ,  0,  u/+i , . . . ,  u..y)  is  the  input  to  Gn ■  The  probability  of  send¬ 
ing  x  through  the  channel  and  receiving  y  is  ri/=i  W We  average  this  value  over  all  2N  ~l  choices 
of  m+i  through  un,  and  the  result  is  W^\yi, . . . ,  yN,  u\, . . .  ,u,i-i\ui  =  0). 

We  then  compute  \yi, . . . ,  yN,  u±, . . . ,  Ui-i\ui  =  1)  the  same  way.  If 

W$(yi,  ...,yN,uu..  .,Ui-i\ui  =  0)  >  W$(yi,..  .,yN,ui,--  .,u,i-i\ui  =  1),  (1) 

we  choose  Ui  =  0;  otherwise,  we  choose  ttj  =  1.  4 

If  we  applied  this  rule  directly,  the  complexity  of  the  decoder  would  be  0(2;V).  Fortunately,  [1]  de¬ 
scribes  a  recursive  method  that  reduces  the  complexity  to  0(iV  log  N). 

After  the  decoder  is  done  computing  u\  through  u;y,  it  outputs  u(A),  the  estimates  of  the  K  non- 
frozen  inputs. 


6.1  LENGTH  2  DECODER 

We  now  describe  the  decoder  in  more  detail  for  the  case  N  =  2.  Figure  10  shows  how  the  decoder’s 
input  is  formed.  The  encoder  maps  the  n’s  (which  may  include  frozen  bits)  to  the  ,x’s,  which  are  transmit¬ 
ted  through  the  channel  W,  and  the  y’s  are  received.  The  y’s  may  be  bits  or  soft  decisions.  It  is  assumed 
that  we  know  the  complete  description  of  W. 


u2 


■»  *2 


Figure  10.  Length  2  encoding  and  transmission. 


Observe  x\  =  u\  0  U2  and  X2  =  U2 ■  Also  u\  =  x\  0  X2  and  U2  =  x2,  so  G2  is  its  own  inverse. 

4Observe  that  ties  are  broken  by  choosing  0.  This  is  the  rule  from  [1],  For  some  purposes  it  is  better  to  break  ties  randomly. 


15 


For  any  y  received,  we  know  the  transition  probabilities  W(y \ x  =  0)  and  W(y\x  =  1).  Let  a  = 
W{yi\x\  =  0),  6  =  W(y\\x\  =  1),  c  =  W(y2\x2  =  0),  d  =  W{y2\x2  =  1).  These  four  numbers  are  all 
the  data  needed  by  the  decoder;  the  values  of  y\  and  y2  are  not  needed. 

To  decode  u\,  we  compute  and  compare  the  probabilities  W(y±,  y2 \u\  =  0)  and  W(yi,  y2\u\  =  1). 
The  y  s  also  depend  on  U2 ,  which  is  equally  likely  to  be  0  or  1.  So 

W(y1,y2\ui  =  0)  =  W(yi,y2\ui  =  0  ,u2  =  0)/2  +  W(yi,y2\ul  =  0  ,u2  =  l)/2 

=  W(yi,y2\xi  =  0,x2  =  0)/2  +  W(yi,y2\xl  =  l,x2  =  l)/2. 

Since  the  transitions  x\  — *  y\  and  x2  ->  1)2  arc  independent,  we  can  factor  the  pairwise  probabilities: 

W(yi,y2\ui  =  0)  =  W(yi\xi  =  0)IL(y2|x2  =  0)/2  +  W(yi\xi  =  1  )W(y2\x2  =  l)/2 
=  ( ac  +  bd)/ 2. 

Similarly, 

W(yi,y2\ui  =  1)  =  W(yi,y2|Mt  =  1,^2  =  0)/2  +  FF(yi,  y2|«i  =  1,«2  =  l)/2 

=  W(yi,y2\xi  =  l,x2  =  0)/2  +  W(yi,y2\x1  =  0,x2  =  l)/2 

=  W(yi\xi  =  l)W(y2\x2  =  0)/2  +  PF(yi|xi  =  0)FF(y2|®2  =  l)/2 
=  ( be  +  ad)/ 2. 

The  ratio  of  these  probabilities  is  (ac  +  bd) / (6c  +  ad).  Dividing  both  numerator  and  denominator  by  bd, 
we  get 

ac  ,  i 

b  1 

c  |  a  * 

This  is  the  likelihood  ratio  (LR)  of  u  i  given  y\  and  y2,  and  it  can  be  expressed  in  terms  of  a/6  and  c/d,  the 
LRs  of  x’i  and  x2.  We  define  the  function  f(p,  q)  =  j/yp  [10]. 

We  choose  u\  =  0  if  /(a/6,  c/d)  >  1;  otherwise,  we  choose  v,\  =  1. 

The  next  step  is  to  decode  u2,  so  we  determine  which  value  of  u2  would  have  a  higher  probability  of 
producing  the  observed  y\  and  y2.  We  compute  the  LR  W(y\ .  y2 1 u2  =  0)/lL(y  | .  y2 1 ?i2  =  1).  According 
to  the  successive  cancellation  (SC)  rule,  we  assume  that  we  have  correctly  decoded  u i .  If  u\  =  0,  then  our 
LR  becomes 

W(y1,y2\ui  =  0 ,u2  =  0)  _  W(yi,y2\x\  =  0,x2  =  0) 

W(yi,y2\ui  =  0  ,u2  =  1)  W(yi,y2\xi  =  l,x2  =  1) 

=  ac/bd 
=  ( a/b)(c/d ). 

If  v.\  =  1,  then  our  LR  becomes 

W(y1,y2\ui  =  1  ,u2  =  0)  _  W(yi1y2\x1  =  l,x2  =  0) 

W(yi,y2\ui  =  1  ,u2  =  1)  W(yi,y2\xi  =  0,x2  =  1) 

=  be /ad 
=  (a/&)_1(c/d). 

In  either  case,  we  again  see  that  the  LR  of  u2  can  be  expressed  in  terms  of  a/6  and  c/d,  the  LRs  of  x\  and 
x2.  We  define  the  function  g(p,  q,  u\)  =  [10]. 
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We  choose  U2  =  0  if  g(a/b,  c/d,  Hi)  >  1;  otherwise  we  choose  U2  =  1. 

Therefore  the  decoder  does  not  need  all  four  values  a,  b,  c,  and  d.  Having  LRs  a/b  and  c/d  is  suffi¬ 
cient. 

One  may  ask,  if  y2  is  a  soft  estimate  of  X2,  and  X2  =  v.2,  why  don’t  we  use  the  same  estimate  for  7/9 ? 
If  X2  is  more  likely  to  be  0  than  1  (or  vice  versa),  then  isn’t  the  same  true  for  U2}-  Usually  it  is.  An  excep¬ 
tion  can  occur  when  the  LR  of  u\  is  <  1  (indicating  u  \  is  more  likely  1)  but  u  \  is  a  frozen  0.  This  means 
that  a  bit  got  flipped  in  transmission.  If  y\  is  a  strong  0  and  7/2  is  a  weak  1 ,  then  7/2  is  more  likely  to  be  in¬ 
correct,  so  we  conclude  X2  =  U2  =  0. 

In  the  next  section,  estimates  of  x  \  and  a; 2  will  also  be  necessary.  These  are  computed  x  1  =  u\  ©  ip) 

and  x'2  =  ii2- 


6.2  EFFICIENT  LENGTH  N  DECODER 

The  decoder  needs  the  following  inputs: 

•  The  LRs  of  the  N  encoder  outputs 

•  The  set  A  C  {1, ... ,  N}  of  frozen  indices 

•  u(Ac),  the  known  values  of  the  frozen  bits 

In  Figure  7,  G4  has  three  columns  that  each  contain  four  bit  values:  the  input  column,  the  output  col¬ 
umn,  and  the  middle  column  that  has  the  bits  u\  0  112, 112,  r/3  0  114,  and  114.  64  also  contains  four  copies  of 
Cb) :  two  on  the  left  that  transform  the  input  bits  to  the  middle  bits,  and  two  on  the  right  that  transform  the 
middle  bits  to  the  output  bits.  Similarly,  in  Figure  8,  we  can  mentally  fill  in  the  missing  details  of  the  64 ’s, 
and  see  that  G%  has  four  columns  that  each  contain  eight  bit  values,  and  between  these  columns  are  three 
columns  that  each  contain  four  copies  of  G2.  In  general,  for  N  =  2n,  Gn  has  n  +  1  columns  that  each 
contain  N  bit  values,  and  n  columns  that  each  contain  N/2  copies  of  G2. 

The  decoder  works  by  following  the  structure  of  Gn,  repeating  the  procedure  in  Section  6.1  for  each 
of  the  nN/2  copies  of  G 2 ■  In  this  way,  it  retraces  the  encoder’s  steps,  and  produces  a  LR  and  a  bit  estimate 
for  each  of  the  (n  +  1)N  bits  in  the  encoder.  The  decoder  has  nN/2  elements  that  each  represent  one  copy 
of  G2.  The  connections  between  these  elements  parallel  the  connections  in  the  decoder.  LRs  are  input  on 
the  right  side  of  the  decoder  (representing  the  encoder  output)  and  are  computed  from  right  to  left.  Bit  esti¬ 
mates  are  first  made  at  the  left,  and  arc  computed  from  left  to  right. 

For  any  bit  b  in  the  encoder,  let  L(b)  be  its  LR,  and  let  b  be  the  decoder’s  estimate.  Each  decoding  ele¬ 
ment  follows  the  steps  listed  below.  In  these  steps  we  will  use  the  symbols  u\,  1.1,2 ,  x\,  and  x'2  for  the  inputs 
and  outputs  of  a  particular  copy  of  G2,  regardless  of  its  place  in  Gn-  One  element’s  x\  may  be  another 
element’s  u\  or  U2- 

1.  If  this  element  is  in  the  rightmost  column,  then  L(x  1)  and  L(x 2)  are  included  in  the  decoder’s  input. 
Otherwise,  wait  until  they  are  computed  by  elements  in  the  next  column  to  the  right. 

2.  Set  L(u\)  =  f(L(xi),L(x2)). 

3.  If  this  element  is  in  the  leftmost  column,  then  set  u\  using  the  decision  rule:  if  this  is  a  frozen  bit, 
use  the  known  value,  otherwise  set  u\  =  0  if  L{u\)  >  1,  or  u\  =  1  if  L{u\)  <  1.  If  this  element  is 
not  in  the  leftmost  column,  wait  for  u\  to  be  computed  by  an  element  in  the  next  column  to  the  left. 
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4.  Set  L(ii2)  =  g(L(xi),L(x2),ui). 

5.  If  this  element  is  in  the  leftmost  column,  then  set  u2  using  the  decision  rule:  if  this  is  a  frozen  bit, 
use  the  known  value,  otherwise  set  u2  =  0  if  L(u2)  >  1,  or  u2  =  1  if  L{u2)  <  1.  If  this  element  is 
not  in  the  leftmost  column,  wait  for  u2  to  be  computed  by  an  element  in  the  next  column  to  the  left. 

6.  Set  x'i  =  u\  ©  u2  and  x2  =  u2. 

We  now  return  to  using  tq  and  xt  to  represent  the  inputs  and  outputs  of  Gn-  This  decoding  method 
has  complexity  0(N  log  N)  because  there  are  N  log  N  decoding  elements,  each  of  which  executes  steps 
1  through  6  once.  Reference  [1]  proves  that  this  method  produces  the  same  results  as  direct  application  of 
the  SC  rule,  Equation  (1).  At  the  time  that  iii  is  decided,  u  \  through  Uj_i  have  already  been  decided,  and 

t,  \  W$(yi,...,yN,ui,...,ui-1\ui  =  0) 

L{Ui)  =  - 777 - . 

W$(yi,  ...,yN,ui,..  ,,Ui-i\ui  =  1) 

6.3  RECURSIVE  IMPLEMENTATION  OF  THE  DECODER 

Recall  that  Figure  9  shows  how  to  construct  Gn  using  two  copies  of  GN/2  and  N / 2  copies  of  G2. 
This  suggests  that  the  decoder  could  be  implemented  with  a  recursive  function.  This  function  would  have 
to  output  the  LRs  L(u\)  through  L(un)  in  addition  to  the  bit  decisions  u\  through  un-  It  would  perform 
the  following  steps: 

1.  Call  itself  to  decode  (aq, . . . ,  xN/2)  with  no  frozen  bits.  Let  the  output  LRs  be  called  L(v i)  through 
L(vn/2)-  The  output  bit  decisions  are  not  used. 

2.  Call  itself  to  decode  (aqv/2+i>  ■  •  • ,  xn)  with  no  frozen  bits.  Let  the  output  LRs  be  called  L(vN,2+l ) 
through  L(vn)-  The  output  bit  decisions  are  not  used. 

3.  Use  the  method  of  Section  6.1  N/2  times,  using  the  appropriate  L(v)’ s  as  input,  to  compute  lAu\ ) 
through  L(un)  and  u  \  through  u^. 

Such  an  implementation  would  be  incorrect,  because  in  steps  1  and  2  this  function  would  decide  the 
v’s  using  the  decision  rule,  but  the  D’s  should  be  computed  from  the  us.  Although  the  v’s  arc  not  needed 
by  the  calling  function,  they  are  needed  internally.  For  example,  hi  is  used  to  compute  L(v2). 

Fortunately,  we  will  show  that  G jy  has  another  decomposition  that  leads  to  a  recursive  decoder  func¬ 
tion. 

The  following  result  may  be  well-known,  but  we  have  not  seen  it  explicitly  stated  in  published  litera¬ 
ture: 

Theorem  1.  For  any  N  =  2n,  Gn  is  its  own  inverse,  i.e.,  for  any  u  6  {0,  1 }  v,  G n (G n (u))  =  u. 

Proof.  Let  F2  be  the  field  with  two  elements  0  and  1.  Gn  is  a  linear  map  from  1:0  ^2  -  so  it  can  be  de¬ 
scribed  as  multiplication  by  an  N  by  N  matrix,  which  will  also  be  called  Gn-  Equation  (70)  of  [1]  shows 

that  Gn  =  BnF®h,  where  F  =  j  ^  ,  F®n  is  its  nth  tensor  power,  and  Bn  is  a  square  matrix  called 
the  bit-reversal  operator.  Therefore  G'v'  =  (F®n)  1  .  Section  VII. B  of  [1]  shows  that  Bf]  =  Bn- 
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Also  we  see  by  direct  computation  that  FF  =  I2 .  Using  the  tensor  product  identity  {AC)  <g>  ( BD )  = 
{A<g>  B)(C <g)  D),  we  get  that  (F  <8)  F)(F <g)  F)  =  I2  (8/2  =  F-  By  induction  on  n,  we  get  F®nF®n  =  jNi 
so  {F^ny1  =  F®n.  Combining  these  equations,  we  get  G^1  =  F®nBjy.  Finally,  Proposition  16  of  [1] 
shows  that  F®nBN  =  GN.  □ 

We  can  make  a  diagram  of  Gjj1  by  flipping  Figure  9  horizontally,  thus  exchanging  the  roles  of  the  in¬ 
puts  and  outputs.  Since  =  Gy,  this  flipped  diagram  provides  a  second  decomposition  of  Gn,  as 
shown  in  Figure  1 1 . 


Figure  1 1 .  Second  decomposition  of  Gat. 


(We  have  also  replaced  2N  with  N,  i.e.,  this  figure  shows  Gn  in  terms  of  Gn/ 2>  instead  of  G2N  in 
terms  of  Gn-)  When  Rn  is  flipped  horizontally,  it  becomes  i?^1,  which  could  be  called  the  “shuffle”: 

Rn(v1,  ■  ■  -,Vn)  =  {vi,VN/2+1,V2,VN/2+2,  ■  ■  -,VN/2  ,Vn)- 

Gn/ 2  and  G2  also  become  their  inverses,  but  by  Theorem  1  this  does  not  make  a  difference. 

Now  we  can  describe  how  to  implement  the  decoder  with  a  recursive  function,  using  an  algorithm 
based  on  [1 1].  Recall  that  the  inputs  to  the  decoder  are  L(x  1)  through  L{xn),  the  set  A  of  non-frozen  in- 
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dices,  and  u(Ac),  the  frozen  bit  values.  The  recursive  function  must  output  both  u  and  x.  The  decoder 
executes  these  steps: 

1.  Split  A  into  two  pieces  A\  =  A  n  {1, , ... . ,  TV/2}  and  A2  =  A  n  {TV/ 2  +  1, . . . ,  TV},  These  are 
the  sets  of  non-frozen  indices  needed  for  the  two  recursive  calls.  Subtract  TV/ 2  from  each  element  of 
A-2,  because  mjv/2+i  through  ujv  are  called  u\  through  u^/ 2  within  the  second  recursive  call. 

2.  Split  u{Ac)  into  u(Ai)  and  ui^A^)  in  a  similar  way. 

3.  Fori  <i<  TV/2,  compute  L(tC2i-i)  =  f{L(x2i-\),  L(x2i)).  This  is  step  2  from  Section  6.2, 
repeated  for  each  of  the  TV/2  copies  of  G2  shown  in  Figure  11. 

4.  Call  itself  with  inputs  L(w  3), . . . ,  L(wn-  1)),  A\,  and  u(Ai),  producing  outputs  (u\, . . . ,  uN/2) 

and  (hi, . .  .,vN/2). 

5.  For  1  <  i  <  TV/2,  compute  L{w2i)  =  g(L(x2i-i),  L(x2i),Vi)  (step  4  of  Section  6.2). 

6.  Call  itself  with  inputs  (L(w2),  L(w 4), . . . ,  L(wn)),  A2 -  and  u(A^ ),  producing  outputs  (ttjv/2+ii  •  •  • ,  ujv) 
and  (Fjv/2+i,  •••,%)• 

7.  For  1  <  i  <  TV/2,  compute  x2i-i  =  Vi  ©  vN/2+i  and  i  =  vN/2+i  (step  6  of  Section  6.2). 

8.  Return  u  and  x. 

This  procedure  has  one  minor  shortcoming;  it  returns  all  TV  elements  of  u,  instead  of  u(Vl),  which  is  the 
estimate  of  the  length  K  encoder  input.  In  our  MATLAB®  implementation,  polarDecode.m  version  3,  this 
problem  is  handled  by  using  a  helper  function  to  execute  the  above  steps.  The  polarDecode  function 
calls  the  helper  function  to  obtain  u  (ignoring  the  x  output)  and  then  returns  u(A). 


6.4  NUMERICAL  IMPLEMENTATION 

Recall  that  /(x,  y)  =  is  one  of  the  functions  we  use  to  compute  LRs.  One  difficulty  in  using  this 
function  is  that  when  we  use  it  repeatedly,  the  LRs  get  closer  to  1.  The  approach  can  be  quite  rapid:  if  5 
and  e  arc  small,  then  /(I  +  <5, 1  +  e)  is  approximately  1  +  Se/2.  Note  that  if  neither  x  or  y  is  exactly  1, 
then  f(x,  y)  is  not  exactly  1,  but  when  f(x,  y)  is  computed  in  double-precision  arithmetic  the  result  could 
be  1,  and  this  is  likely  to  happen  when  decoding  a  large  polar  code.  This  is  a  problem  because  to  make  bit 
decisions,  we  compare  an  LR  to  1 . 

To  avoid  this  problem,  C.  Leroux  et  al.  [12]  suggested  computing  the  logarithms  of  the  LRs  (LLRs)  in¬ 
stead  of  the  LRs  themselves.  Then  we  make  bit  decisions  by  comparing  LLRs  to  0.  This  is  better  because 
a  double-precision  LLR  can  be  extremely  close  to  0.  Such  LLRs  correspond  to  LRs  extremely  close  to  1, 
which  cannot  be  distinguished  from  1  in  double  precision. 

A  hardware  implementation  of  a  decoder  would  usually  compute  with  fixed-point  numbers  instead 
of  double  precision.  Fixed-point  numbers  cannot  distinguish  numbers  near  0  as  well  as  double  precision 
numbers,  but  even  so  it  is  better  to  compute  LLRs  rather  than  LRs.  The  reason  for  this  is  that  when  decod¬ 
ing  for  a  BMS  channel,  the  LRs  x  and  1  / x  are  equally  likely  to  occur. 

For  example,  if  we  arc  using  LRs  and  arc  limited  to  6  bits,  we  probably  would  choose  to  represent  LRs 
from  1/8  to  8  in  steps  of  1/8  since  1/8  and  8  arc  equally  likely.  This  means  we  have  56  different  LRs 
greater  than  1  (probable  zeroes),  and  only  7  less  than  1  (probable  ones).  In  contrast,  we  could  represent 
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LLRs  from  —2  to  31/16  in  steps  of  1/16.  These  correspond  to  LRs  from  0.135  to  6.94,  roughly  the  same 
range,  with  a  good  balance  between  probable  zeros  and  probable  ones,  and  slightly  better  resolution  near 
the  decision  point. 

Reference  [12]  showed  that  in  decoding  a  (1024,  512)  polar  code,  using  6-bit  LLRs  resulted  in  per¬ 
formance  very  close  to  floating  point,  did  not  specify  The  range  of  LLRs  these  6  bits  represented  is  not 
specified  in  [12]. 

When  we  compute  with  LLRs,  the  formula  for  f(x,  y )  becomes 

j  f  exp(x)  exp(y)  +  1\ 

&  v  exp(x)  +  exp(y)  )  ' 

If  we  use  this  formula  as  written  when  either  x  or  y  is  near  0,  it  causes  the  same  loss  of  precision  that  re¬ 
sults  from  computing  with  LRs  near  1.  Reference  [10]  gives  the  equivalent  formula 

f(x,y)  =  2  tanh-1  (tanh  tanh  (|))  , 

which  is  accurate  near  0.  The  function  g(x,  y,  0)  becomes  x  +  y,  and  g(x,  y,  1)  becomes  y  —  x.  Reference 
[10]  points  out  that  these  functions  were  already  used  for  belief  propagation  decoding  of  LDPC  codes.  /  is 
often  called  the  box  operator,  and  is  often  replaced  by  an  approximate  formula  sign(x)  sign(y)  min(|x|,  \  y\ ) 
Reference  [12]  showed  that  this  approximation  caused  negligible  performance  degradation  for  N  =  1024, 
and  about  0.1  decibels  (dB)  performance  degradation  for  N  =  214. 

Version  3  of  polarDecode.m  provides  options  to  approximate  the  box  operator  and  to  compute  in  fixed- 
point.  These  features  were  removed  in  version  4  for  ease  of  use. 


6.5  FEASIBLE  BLOCK  SIZES 

Because  [1]  proved  that  better  results  can  be  achieved  with  large  block  sizes,  it  is  of  interest  to  deter¬ 
mine  what  block  sizes  can  be  implemented  in  practice. 

Many  papers,  including  [12-17]  have  proposed  implementing  polar  coding  in  field  programmable  gate 
arrays  (FPGAs)  or  application-specific  integrated  circuits  (ASICs).  Such  designs  can  be  much  faster  than 
software  implementations,  enabling  large  block  sizes  at  high  data  rates. 

Reference  [14]  describes  an  FPGA  implementation  that  scales  up  to  A'"  =  221.  The  limiting  factor  is 
the  amount  of  SRAM  on  the  chip.  Table  VIII  of  [14]  shows  that  block  size  221  requires  39,853,440  bits, 
which  is  available  on  an  Altera  Stratix®  V  5SGXMB6R3F43I4  FPGA.  The  maximum  data  rate  of  this 
design  is  p  =  3 3,/},  Mpbs.  Therefore,  it  decodes  a  block  in  7)/  =  ^  B  s  =  64  ms.  With  a  code  rate  of 
1/2  and  data  rate  of  10  Mbps,  the  total  latency  is  about  270  ms. 

Using  the  formula  from  [14],  but  different  numbers  of  quantization  bits  (Qc  =  6,  0  =  4),  we  found 
that  block  size  222  requires  48,236,032  bits  of  SRAM,  which  is  available  on  some  Stratix®  V  models.  De¬ 
coding  a  block  requires  12.3  million  clock  cycles.  We  cannot  predict  T,i  because  the  clock  speed  must  be 
determined  through  FPGA  synthesis. 

The  largest  block  size  we  have  found  in  the  polar  codes  literature  was  223,  in  [18].  This  code  size  was 
simulated  in  software.  The  throughput  was  not  specified. 


21 


7.  PERFORMANCE  ANALYSIS 


7.1  VIRTUAL  CHANNELS 


Figure  3  showed  that  an  information-theoretic  channel  can  encompass  modulation  and  demodulation 
in  addition  to  RF  propagation.  We  can  go  further,  and  include  the  encoder  and  decoder  in  the  channel,  or 
parts  of  them,  as  shown  in  Figure  12.  This  figure  shows  a  channel  whose  input  is  Ui  for  some  i  between  1 
and  N,  and  its  output  is  the  information  that  the  decoder  uses  to  determine  Ui,  namely  y  and  u\  through 
iii- 1-  (The  set  of  possible  output  symbols  is  yN  x  {0, 1}*_1.)  This  channel  represents  how  difficult  it  is  to 
correctly  decode  ut. 

Reference  [1]  describes  an  genie-aided  decoder  that  has  access  to  the  true  values  of  u\  through  ut-\, 
but  not  Ui,  at  the  time  it  decides  Ui.  The  genie-aided  decoder  uses  v,\  through  u,_  i  the  same  way  the  real 
decoder  uses  u\  through  Ui-\.  The  genie-aided  decoder  is  easier  to  analyze  than  the  real  decoder,  and  it 
has  a  very  useful  property: 


Input  bits 


Output  bits 


Figure  12.  Virtual  channel. 


Proposition  2.  The  genie-aided  decoder  will  decode  a  block  correctly  if  and  only  if  the  real  decoder  does. 

Proof.  If  either  decoder  makes  a  mistake,  let  Ui  be  the  first  bit  that  either  decoder  decodes  incorrectly. 

Since  the  real  decoder  did  not  make  a  mistake  before  the  fth  bit,  we  have  uj  =  Uj  for  1  <  j  <  i.  So 
both  decoders  use  the  same  input  to  decode  m,  so  they  produce  the  same  answer.  □ 

Therefore,  by  analyzing  the  genie-aided  decoder,  we  can  prove  results  about  the  BLER  of  the  real  de¬ 
coder. 

(i) 

For  1  <  i  <  N,  the  virtual  channel  WN  is  constructed  from  a  channel  W,  a  length  N  polar  encoder, 
and  a  length  N  SC  polar  decoder.  It  has  input  and  output  (y,  u\, . . . ,  u%- 1).  (Again,  the  set  of  possible 
output  symbols  is  yN  x  {0,  l}1-1.)  This  channel  represents  how  difficult  it  is  for  the  genie-aided  decoder 
to  correctly  decode  m. 

A  virtual  channel  can  be  represented  by  an  array  with  two  rows,  corresponding  to  the  inputs  0  and  1, 
and  one  column  for  each  output  symbol.  The  entries  in  the  array  are  the  transition  probabilities,  and  by 
definition  each  row  must  sum  to  1.  For  example,  if  the  outputs  of  W  are  y  =  {1,  2,  3, 4,  5,  6,  7,  8},  then 
has  8422  =  16384  different  output  symbols  ranging  from  (1, 1, 1, 1, 0, 0)  to  (8,  8, 8, 8, 1, 1). 
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However,  recall  from  Section  6.2  that  when  decoding  U3,  the  decoder  does  not  use  six  separate  values 
yi  through  1J4,  u \ ,  and  112-  Instead,  it  uses  the  LR  of  113,  which  is  computed  recursively  from  those  six  val¬ 
ues.  This  LR  is  the  ratio  of  the  two  numbers  in  one  column  of  the  array.  So  we  may  think  of  these  16384 
output  symbols  as  merely  16384  different  columns  in  an  array;  we  do  not  need  to  know  if  a  particular'  col¬ 


umn  represents  (2, 1, 1,  7,  0,  0)  or  (5, 4, 3,  5, 1,  0),  for  example.  Also,  if 


and 


are  two  columns  such 


that  a/b  =  c/d  ,  i.e.,  the  two  columns  have  the  same  LR,  they  are  effectively  identical,  and  may  be  re¬ 


placed  by  a  single  column 


a  +  c 
b  +  d 


The  channel  probabilities  of  the  virtual  channels  give  the  probabilities  of  the  various  LRs  that  can  arise 

.  This  column  means  that  in 

10%  of  all  uses  of  the  length  4  genie-aided  decoder,  113  =  0  and  L(u3)  =  0. 2/0.1  =  2.  Likewise,  in  5% 
of  all  uses  of  the  length  4  genie-aided  decoder,  U3  =  1  and  L(ii:>)  =  2.  (We  multiply  the  array  numbers  by 
0.5  because  U3  has  a  50%  chance  of  being  0  and  a  50%  chance  of  being  1.)  In  both  cases,  the  genie-aided 
decoder  decides  113  =  0,  so  this  column  contributes  0.05  to  the  BER  of  1/3.  We  compute  the  total  BER  of 
U3  by  adding  the  smaller  entry  from  each  column,  and  dividing  the  sum  by  2. 


in  the  decoder.  For  example,  suppose  one  of  the  columns  in  Wz 


(3)  • 


IS 


7.2  CHANNEL  POLARIZATION 

For  any  n  >  0,  N  =  2n,  and  1  <  i  <  N,  Equation  (22)  of  [1]  shows  how  to  compute  the  transition 

(2i—  1) 

probabilities  of  W./N  from  the  transition  probabilities  of  WN  .  Similarly,  Equation  (23)  shows  how  to 

(2  i)  ({) 

compute  the  transition  probabilities  of  W,2N  from  the  transition  probabilities  of  WN  . 

Let  I(Q)  denote  the  capacity  of  a  channel  Q,6  and  let  Q  ht  Q'  and  Q  Q"  denote  the  transfor¬ 
mations  described  in  Equations  (22)  and  (23),  respectively.  Proposition  4  of  [1]  is  that  I{Q')  +  I{Q")  = 

2 1(Q),  and  I{Q')  <  I(Q )  <  I(Q"),  with  equality  if  and  only  if  I(Q)  equals  0  or  1. 

d) 

Starting  with  W,  we  can  compute  Ii  'v  for  any  i  between  1  and  N  by  using  Equations  (22)  and  (23)  n 
times.  There  are  N  different  ways  to  choose  a  sequence  of  n  transformations,  and  these  N  sequences  yield 
N  different  virtual  channels  ll/y1 '  through  W^\  as  illustrated  in  Figure  6  of  [1],  “The  tree  process  for  the 
recursive  channel  construction.” 

Arikan  proved  (Theorem  1  of  [1])  that  for  large  enough  n,  almost  all  of  the  virtual  channels  have  ca¬ 
pacity  near  0  or  near  1.  This  effect  is  called  channel  polarization.  If  a  BMS  channel  has  capacity  near  1, 
its  BER  is  near  0,  and  if  its  capacity  is  near  0,  its  BER  is  near  0.5.  Furthermore,  the  proportion  of  virtual 
channels  with  capacity  near  1  is  close  to  I(W).  Therefore,  we  can  construct  a  code  with  rate  near  I ( W ) 
and  a  low  BER  by  freezing  all  inputs  i  such  that  I(wffl)  is  not  near  1. 


'Proposition  13  of  [1]  shows  that  if  W  is  symmetric,  then  so  is  WN v .  If  we  combine  two  columns,  then  we  should  also  com¬ 
bine  their  complements  to  preserve  the  symmetry. 

throughout  this  document,  we  use  W  for  a  channel  that  gets  its  input  from  an  encoder  and  sends  its  output  to  a  decoder. 
IT'//  always  represents  a  virtual  channel  constructed  from  IT.  We  use  Q  for  an  arbitrary  channel  that  could  be  either  of  these 
types  or  something  else. 
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8.  CODE  CONSTRUCTION 


We  construct  an  (TV,  K)  polar  code  by  choosing  K  elements  of  {1, ,  TV}  to  be  the  non-frozen  in¬ 
dices.  To  get  a  good  code  for  a  particular'  BMS  channel  W,  we  must  choose  indices  i  such  that  I(W$)  is 
near  1.  However,  computing  I(W$)  exactly  is  not  feasible  for  even  moderate  code  sizes,  because  if  W 
has  a  outputs,  then  W$  has  aN2l~1  outputs.  Various  approaches  have  been  proposed  for  estimating  the 
capacity  or  BER  of  W$. 


8.1  TAL/VARDY  METHOD 

We  consider  [19]  by  I.  Tal  and  A.  Vardy  as  the  state  of  the  art  in  code  construction.  Their  method 
uses  a  parameter  //.  A  larger  /j  means  a  better  approximation  but  also  longer  running  time.  To  construct 
a  code  of  length  N ,  the  algorithm  uses  0 ( Nf  i2  log  //  )  instructions.  Reference  [19]  gives  results  for  val¬ 
ues  of  n  ranging  from  8  to  512.  The  method  works  by  starting  with  the  transition  probabilities  of  W,  and 
then  constructing  virtual  channels  following  the  tree  process  shown  in  Figure  6  of  [1],  Reference  [19]  uses 
Q  i-A  Q  El  Q  and  Q  i— >■  Q  %  Q  for  the  channel  transformations  described  in  Equations  (22)  and  (23)  of  [1]. 

Whenever  a  channel  is  constructed  with  >  /j  outputs,  it  is  replaced  by  a  similar  channel  with  //  out- 

(i) 

puts.  At  each  leaf  of  the  tree,  we  produce  a  channel  Q,  that  is  similar  to  WN  ,  and  we  compute  the  BER  of 
Qi  to  estimate  the  BER  of  W^\ 

One  might  ask  how  a  channel  with  a  huge  number  of  outputs  could  be  similar  to  a  channel  with  a  small 
number  of  outputs.  The  answer  is  that  the  decoder  only  cares  about  the  LR  of  each  output.  If  we  have 
many  symbols  with  approximately  equal  LRs,  we  can  combine  them  to  one  symbol  whose  LR  is  within 
the  range  of  those  LRs,  and  this  change  has  very  little  effect  on  the  channel’s  BER. 

Reference  [19]  presents  two  methods  to  reduce  the  number  of  channel  outputs  to  //.  The  first  method 
is  called  “degrading  merge”.  It  is  used  in  “Algorithm  A”,  which  guarantees  that  BER(Q,)  >  BER(JE®). 

The  second  method  is  called  “upgrading  merge”.  It  is  used  in  “Algorithm  B”,  which  guarantees  that  BERfQ,)  < 
BER(W^).  Thus  BER(IE®)  is  bounded  between  two  values.7  If  these  values  are  far  apart,  it  may  be 
worthwhile  to  rerun  the  algorithms  with  a  larger  //  to  get  a  better  estimate.  Reference  [19]  gives  examples 
showing  that  the  bounds  can  be  close  together  for  large  enough  //.  It  also  presents  “Algorithm  D”,  which 
is  the  same  as  Algorithm  A  except  that  it  computes  an  additional  number  Z,  that  is  guaranteed  to  satisfy 
BER(Qi)  >  Zi  >  BER(IE® ),  giving  an  even  better  upper  bound. 

After  running  Algorithm  A  (resp.  D),  we  construct  an  (N,  K)  polar  code  by  letting  A  be  the  indices  of 
the  K  smallest  values  of  BER(Q,J  (resp.  Zj).  The  sum  of  these  K  values  is  an  upper  bound  on  the  BLER 
of  this  code  when  used  for  channel  W.  K  could  be  chosen  in  advance,  or  it  could  be  chosen  to  achieve  a 
par  ticular  BLER  bound. 

8.1.1  Implementation 

The  file  algorithmA.m  is  a  MATLAB®  implementation  of  Algorithm  A,  and  AlgorithmD.java  is  a 
Java™  implementation  of  Algorithm  D.  We  did  not  implement  Algorithm  B.  Reference  [19]  also  shows 
how  to  modify  the  degrading  merge  algorithm  to  convert  a  continuous  channel  to  a  similar  discrete  chan¬ 
nel.  This  algorithm  assumes  that  we  are  able  to  compute  integrals  of  the  transition  probabilities.  We  im- 

7These  guarantees  depend  on  exactly  implementing  the  operations  described  in  the  paper.  If  these  operations  are  implemented 
with  double  precision  arithmetic,  it  is  possible  that  accumulated  rounding  error  could  cause  violation  of  the  bounds. 
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plemented  the  continuous  degrading  merge  for  the  specific  case  of  an  AWGN  channel.  The  MATLAB® 
version  is  degrade AwgnToDiscrete.m,  and  the  Java™  version  is  a  constructor  in  Channel.java. 

The  function  algor ithmA  returns  the  capacity,  BER,  and  Bhattacharyya  parameter  (see  [1]  for  defi¬ 
nition)  of  each  Qi.  The  user  can  then  use  other  MATLAB®  functions  to  sort  the  BERs  and  choose  A. 

In  the  Java™  implementation,  the  AlgorithmD  class  has  a  compute  method  that  estimates  BER(IT^*'1) 
for  1  <  j  <  m  and  1  <  i  <  2A  This  class  also  has  a  computeAndSaveFixedRateCodes  method. 

For  each  j  this  method  sorts  the  2J  BER  estimates,  chooses  A  to  construct  a  length  2:l  code  of  a  specified 
rate,  and  computes  the  bound  for  this  code’s  BLER.  Finally,  this  method  creates  a  MATLAB®  script  file. 

When  this  script  is  run  in  MATLAB®,  it  creates  one  MAT-file  for  each  code.  The  individual  BER  esti¬ 
mates  are  not  available  to  MATLAB®,  but  they  are  saved  by  serialization  of  the  AlgorithmD  object. 


The  degrading  merge  algorithm  using  an  array  with  L  columns  to  represent  a  channel  with  2 L  outputs. 
Specifically,  if  Q{y |0)  =  a  and  <2(y|l)  =  b,  then  Q(y\0)  =  b  and  Q(y |1)  =  a.  If  a  >  b,  then  the  column 

explicitly  represents  y  and  implicitly  represents  y. 

If  y  is  the  set  of  outputs  of  Q,  then  the  outputs  of  Q  ®  Q  are  y  x  y,  and  the  outputs  of  Q  [1  Q  are 
yxyx{0,l}. 

Here  arc  some  things  we  learned  in  the  process  of  coding  the  algorithms: 


1.  Reference  [19]  gives  no  details  of  how  to  implement  QmQ  and  Q  ®  Q.  We  found  that  both  of  these 
operations  could  be  implemented  more  easily  using  an  array  that  has  one  column  for  each  pair  of 
complementary  channel  outputs.  For  0,  if  the  input  has  L  columns  representing  2 L  channel  outputs, 
then  the  output  has  2 L2  columns  representing  4 L2  channel  outputs.  We  had  to  ensure  that  these  2 L2 
columns  included  one  of  each  pair  of  complementary  channel  outputs.  We  found  that  in  Q  0  Q, 

the  complement  of  channel  output  (yi,y2)  is  (yi ,  2/2)-  So  the  boxStar  function  computes  (Q  0 
Q)  (2/1 1 2/2  I'm)  for  all  2 L  values  of  y\ ,  but  only  L  values  of  y2,  namely  those  explicitly  represented  in 
the  input. 

2.  For  ©,  if  the  input  has  L  columns  representing  2 L  channel  outputs,  then  the  output  has  4L2  columns 

representing  8L2  channel  outputs.  However,  we  found  that  for  all  yi,  r/2  £  A  ar|d  u2  €E  {0,1}, 

(Q  ©  QX2/1, 2/2, l\u2)  =  (<2  ©  Q)(yi,y2,0\u2).  This  means  that  channel  outputs  (2/1, 2/2,1)  and 
(yi,  2/2, 0)  are  equivalent,  and  may  be  combined.  This  reduces  <2  ©  <2  to  2 L2  columns  representing 
4L2  channel  outputs.  The  circleStar  function  computes  (Q®Q)(yi,  y2, 0|«2)  only,  and  doubles 
the  results  to  include  the  symbols  combined  with  them. 

3.  We  had  to  ensure  that  these  2 L2  columns  included  one  of  each  pair  of  complementary  channel  out¬ 
puts.  We  found  that  in  <2  ©  <2,  the  complement  of  channel  output  (yi,y2,  0)  is  (yi,  2/2,0).  So  the 
circleStar  function  computes  (Q  ©  <2)(yi,  y2,0\u)  for  all  2 L  values  of  yi,  but  only  L  values  of 
y2,  namely  those  explicitly  represented  in  the  input. 


4.  Recall  that  if  channel  output  y  has  transition  probabilities 


a 

b 


,  then  y  has  transition  probabilities 


.  The  boxStar  or  circleStar  function  will  work  correctly  if  its  input  includes  one  of  each 

pair  y,  y;  it  does  not  matter  which  one  is  included.  However,  the  degrading  merge  requires  its  input 
to  include  the  one  with  the  larger  probability  on  top.  The  outputs  of  boxStar  and  circleStar 
generally  do  not  satisfy  this  property,  even  if  the  input  does.  So  whenever  boxStar  or  circleStar 
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outputs  a  column 


with  a  <  b,  we  replace  this  column  with 


of  complementary  channel  outputs. 


,  which  represents  the  same  pair 


5.  The  degrading  merge  works  by  combining  channel  outputs.  Two  columns 


and 


with  sim¬ 


ilar  LRs  are  replaced  by  a  single  column 


.  The  degrading  merge  algorithm  chooses  which 


a  +  c 
b  +  d 

columns  to  combine  with  the  goal  of  minimizing  the  decrease  in  channel  capacity. 


One  might  ask,  since  we  arc  ultimately  interested  in  the  BER  of  each  virtual  channel,  why  do  we 
minimize  the  decrease  in  channel  capacity,  instead  of  minimizing  the  increase  in  BER?  We  discov¬ 
ered  that  combining  columns  of  Q  has  no  effect  on  the  BER  of  Q,  but  it  can  increase  the  BER  of 
Q  ®  Q  and  other  channels  that  occur  later  in  the  tree.  Reference  [19]  does  not  explain  how  to  com¬ 
pute  this  effect,  so  it  does  not  prove  that  this  method  of  combining  columns  minimizes  the  increase 
in  BER. 


6.  The  degrading  merge  algorithm  uses  a  heap  data  structure  to  determine  which  columns  to  combine. 
Reference  [19]  defines  a  heap  as  a  data  structure  that  associates  keys  to  data,  and  supports  four  oper¬ 
ations: 


•  Insert:  add  a  new  key-value  pair  to  the  heap 

•  GetMin:  find  the  key-value  pair  with  the  minimum  key  in  the  heap 

•  RemoveMin:  remove  the  key- value  pair  with  the  minimum  key  in  the  heap 

•  ValueUpdated:  change  the  key  of  a  key- value  pair. 

It  states  that  getMin  has  running  time  0(1),  and  the  other  three  operations  have  running  time  0(log  L) 
where  L  is  the  size  of  the  heap. 

Such  a  structure  is  often  called  a  min-heap,  to  distinguish  it  from  a  max-heap  that  supports  getMax 
and  removeMax  instead  of  getMin  and  removeMin. 

Reference  [19]  refers  to  [20]  for  the  heap  implementation.  However,  the  heap  described  in  [20] 
does  not  support  the  valueUpdated  operation.  We  searched  the  Internet  for  other  heap  implemen¬ 
tations,  and  found  that  max-heaps  typically  support  increaseKey  and  min-heaps  typically  support 
decrease  Key,  but  we  could  not  find  an  implementation  that  supports  both.  Fortunately,  we  found  a 
way  to  implement  decreaseKey  on  a  max-heap  in  0(log  L)  time. 


8.1.2  Results 

Table  I  of  [19]  shows  some  results  of  computing  a  (220, 445340)  polar  code  for  the  channel  BSC(p  = 
0.11).  It  shows  that  using  Algorithm  A  with  //  =  8,  the  resulting  BLER  upper  bound  is  5.096030e-03. 
Using  our  implementation,  we  obtained  5.083668e-03,  which  is  about  0.2%  smaller. 

Figure  2(b)  of  [19]  shows  some  results  of  computing  polar  codes  for  the  channel  AWGN(a2  =  0.1581). 
The  code  lengths  are  210  and  220,  and  for  each  length  the  code  rates  range  from  0.9  to  1.  Reference  [19] 
says  “the  value  of  /i  did  not  exceed  512.”  This  figure  is  a  graph  with  code  rate  on  the  horizontal  axis,  and 
the  base- 10  logarithm  of  the  BLER  bound  on  the  vertical  axis. 

We  used  AlgorithmD.java  to  repeat  these  calculations  for  the  N  =  210  case  using  //  =  512.  Figure  13 
shows  a  graph  of  our  results.  The  results  appeal-  to  be  very  similar  to  Figure  2(b)  of  [19]. 


27 


Figure  13.  Reproducing  a  result  from  Tal  and  Vardy. 


8.2  GAUSSIAN  APPROXIMATION  METHOD  FOR  AWGN  CHANNELS 


Recall  from  Section  7.1  that  the  transition  probabilities  of  the  virtual  channels  indicate  the  probabil- 

a 


ities  of  the  various  LRs  that  can  arise  in  the  decoder.  A  column 


represents  a  probability  (a  +  b)/ 2 


that  the  LR  equals  a/b.  The  Tal/Vardy  code  construction  method  works  by  approximating  the  virtual  chan¬ 
nel,  using  a  channel  with  at  most  p  outputs.  So  the  approximate  LR  distribution  is  described  by  a  set  of  2// 
numbers,  half  of  which  may  be  omitted  due  to  symmetry. 


Several  papers,  including  [6]  and  [21],  have  suggested  that  for  an  AWGN  channel  we  can  use  a  more 
efficient  method.  If  we  describe  the  distribution  of  LLRs  instead  of  LRs,  we  can  use  a  much  simpler  ap¬ 
proximation:  a  Gaussian  distribution.  However,  neither  of  these  papers  clearly  explains  why  this  method 
works.  We  found  the  explanation  in  [22]  by  D.  Wu,  Y.  Li,  and  Y.  Sun.  However,  then-  explanation  requires 
a  correction. 

(i) 

Previously,  we  have  used  L  to  represent  an  LR,  but  in  this  section  it  will  be  an  LLR.  Let  LN  denote 

H)  (i) 

the  LLR  of  ,  i.e.,  the  LLR  of  the  ut  computed  by  a  genie-aided  SC  decoder.  LN  is  a  random  variable 

that  depends  on  y  and  u\  through  rq_i.  1 1  and  L^1’  are  both  computed  recursively  from  two  other 

(i) 

random  variables  that  each  have  the  distribution  of  LyN^,  using  the  functions  /  and  g  described  in  Section 
6.4. 


It  is  well  known  that  when  using  optimal  decoding  of  a  linear  block  code  with  a  BMS  channel,  the 
probability  of  block  error  is  the  same  regardless  of  which  codeword  was  transmitted.8  The  SC  polar  de¬ 
coder  is  not  optimal,  but  Arikan  proved  something  similar  for  SC  decoding: 


Theorem  3.  (Corollary  1  of  [1])  For  1  <  i  <  N,  the  genie-aided  decoder’s  probability  of  incorrectly 
decoding  u,  is  the  same  regardless  of  which  codeword  is  transmitted. 


Therefore,  we  can  assume  for  convenience  that  u  and  x  are  all  zeroes.  To  see  why  this  is  helpful,  refer 
to  the  decoding  procedure  in  Section  6.3.  In  the  genie-aided  decoder,  it  is  the  ufs,  not  the  uf  s,  that  propa- 

sThis  assumes  that  ties  are  broken  randomly.  In  an  AWGN  channel  (or  any  continuous  channel),  the  probability  of  an  exact  tie 
is  0,  so  we  will  ignore  the  possibility  of  ties  in  this  section. 
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gate  from  left  to  right  through  the  decoder.  So  when  we  compute  Lf'uy,)  =  g{L{x2i-i),  L(x2i),Vi)  in  step 
5,  Vi  is  always  0,  so  g  is  always  an  addition,  not  a  subtraction. 

Now  let  W  be  an  AWGN  channel.  Recall  that  its  transition  probabilities  arc  W{y\Qi)  =  — U=  exp(—  ) 

and  W(y\\)  =  py-  exp(—  )•  For  any  output  y,  the  corresponding  LLR  is  log  and  since 

we  assume  only  0’s  arc  transmitted,  the  probability  of  this  LLR  occurring  is  W(y\0).  We  compute 


L(y)  =  log 


(W(y\0)\ 

\W(y\l)J 


log 


l 

cry/ 2n 
1 


=  log 


(y  —  i)2  |  (z/  + 1)2\\ 

2a2  +  2a2  ) ) 


~{y  -  i)2  +  {y  +  l)2  =  2 y 

2  a2  cr2 


Since  y  is  a  normally  distributed  variable  with  mean  1  and  variance  cr2,  then  L(y)  =  %  is  a  normally 
distributed  variable  with  mean  2/ a2  and  variance  a2  (A)2  =  4/cr2.  The  variance  is  twice  the  mean.  We 
define  a  consistent  normal  density  variable  (CNDV)  to  be  a  normally  distributed  variable  whose  variance  is 
twice  its  mean.  Then  L(y)  is  a  CNDV. 

Next,  Lp  =  f(L(yi),  L(y 2)),  where  L{y  1)  and  Lfa)  are  both  variables  with  the  distribution  of  L(y). 
According  to  the  “GA  principle,”  when  /  is  applied  to  two  CNDVs,  the  result  can  be  approximated  by  a 
CNDV.  Therefore,  iP  '  is  an  approximate  CNDV.  In  addition,  Lp  =  g(L(yi),  Lfa),  0)  =  L(yi )  +  Lfyc)- 
It  is  well  known  that  the  sum  of  normal  variables  is  a  normal  variable.  The  mean  of  the  sum  is  the  sum  of 
the  means,  and  the  variance  of  the  sum  is  the  sum  of  the  variances.  So  iJp  has  mean  4/a2  and  variance 
8/a2,  and  is  a  CNDV.  Applying  these  arguments  repeatedly,  we  find  that  all  of  the  virtual  channels  arc 
approximate  CNDVs.  However,  it  is  not  clear  how  much  the  approximation  degrades  when  /  is  applied 
repeatedly. 

(i) 

Since  each  LN  is  normally  distributed  with  the  variance  equal  to  twice  the  mean,  we  can  characterize 
its  distribution  using  the  mean  only,  compared  to  //  parameters  used  in  the  Tal/Vardy  method.  We  write  this 
mean  as  E(L$).  We  compute  it  recursively:  E(Lpp)  =  2E(L^p2),  and  E(Lp  ''*)  =  u ;^£,(L^y2)^, 
where  uj  is  defined  by 

u(x)  =  (fr1  (1  -  (1  -  </>0))2)  (2) 

and  4>  is  defined  by 


1  f  °o  /  T  \ 

(f>(x)  =  1 - -  /  tanh  (  -  )  exp  [— (r  —  ,t)2/(4x)1  dr. 

v4vr.T/_oo  \2J 


Finally,  BER (W$)  =  \  erfc 


).5  y/E(I$: 


(3) 


8.2.1  Error  in  Wu,  Li,  and  Sun 

The  procedure  above  computes  an  estimate  of  BER(FF^),  the  probability  of  error  of  a  genie-aided 
decoder.  However,  [22]  claims  that  this  procedure  computes  P(Ci),  defined  as  the  conditional  probability 
of  a  real  (i.e.,  not  genie-aided)  SC  decoder  incorrectly  decoding  the  Ah  bit,  subject  to  the  condition  that  all 
previous  bits  arc  decoded  correctly.  The  BLER  is  then  computed  as  1  —  n,e^(l  —  -P(Cj)). 

At  first  glance,  one  might  think  that  P(Cj)  =  BER(W^)  because  both  arc  the  probability  of  error 
in  a  situation  where  the  decoder  has  the  correct  value  of  all  previous  bits.  However,  BER(IE^)  considers 
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all  possible  values  of  y,  but  P(C* )  considers  only  values  of  y  that  would  have  caused  all  previous  bits  to 
be  decoded  correctly.  This  imposes  a  selection  bias  on  the  channel  noise.  If  several  previous  bits  were  all 
decoded  correctly,  this  probably  means  that  not  many  bits  were  flipped  in  the  channel,  which  means  that  ut 
is  also  more  likely  to  be  decoded  correctly.  Therefore,  we  expect  P(Cr)  <  BER(W$}),  although  this  is 
not  a  proof. 

P(Ci)  is  affected  by  frozen  bits.  If  the  zth  bit  is  frozen  then  P{Ci)  =  0,  so  assume  the  /'th  bit  is  not 
frozen.  If  some  of  the  first  i  —  1  bits  arc  frozen,  then  this  enlarges  the  set  of  vectors  y*  that  will  cause  the 
first  i  —  1  bits  to  all  be  decoded  correctly,  so  the  bias  is  reduced.  If  the  first  i  —  1  bits  arc  all  frozen,  then 
P(Ci)  =  BER(W$). 

For  example,  let  N  =  2,  and  let  a2  =  0.25.  Using  the  algorithm  of  [22],  we  computed  the  esti¬ 
mates  P(C\ )  =  0.04473  and  P(C2)  =  0.002363.  We  then  ran  106  trials  of  encoding,  transmission,  and 
genie-aided  decoding  with  no  frozen  bits,  producing  the  estimates  P{C\)  =  BER^Iuj1'*)  =  0.044794, 
BERflLp  )  =  0.002304,  and  P(C->  )  =  5.4217  •  10-4.  The  corresponding  3-sigma  confidence  intervals  arc 
(0.044173, 0.045415),  (0.002160,  0.002448),  and  (4.5715  •  10“4,  5.9812  •  10“4).  So  the  P(C*)’s  computed 
with  the  algorithm  agree  with  the  simulation’s  BER^I'F^i’s. 

If  we  had  an  efficient  way  to  compute  P(C*),  it  would  still  be  difficult  to  use  this  for  code  construction 
because  of  the  dependence  on  previous  frozen  bits. 


8.2.2  Implementation 


The  file  wuLiSun.m  is  our  MATLAB®  implementation  of  the  Gaussian  approximation  method. 


<j)(x)  = 


(4) 


The  function  (p  is  hard  to  compute  because  it  requires  integrating  from  —  oo  to  oo.  Reference  [22]  uses 
an  approximation: 

exp(— 0.4527.T0-86  +  0.0218),  0  <  x  <  10, 

Vfexp(-f)  (i-i)  ^>10- 

Reference  [22]  then  says  “In  practice,  we  can  store  samples  of  (4),  and  look  up  the  result  via  a  bi-search 
method.”  We  took  this  to  mean  that  the  look-up  table  is  used  to  compute  cp-1.  The  look-up  table  contains 
<p{x)  for  many  values  of  x.  To  compute  p1  (y),  we  find  the  value  nearest  y  in  the  look-up  table,  and  return 
the  corresponding  x.  Reference  [22]  says  nothing  about  how  many  values  to  include  in  the  table  or  how  to 
choose  them.  We  used  108  values  of  x,  logarithmically  spaced  from  2-100  to  2100. 

The  file  gaussianLlrXor.m  is  our  MATLAB®  implementation  of  the  function  c o.  We  found  that  for 
very  small  values  of  x,  ui(x)  >  x.  This  cannot  be  correct  because  it  would  imply  that  Q  S3  Q  has  a  higher 
capacity  than  Q.  Apparently,  it  indicates  that  the  approximation  of  <P  is  not  good  for  very  small  x.  So  we 
modified  gaussianLlrXor.m  to  return  mm(.:c,  u{x)).  This  is  an  improvement,  but  the  results  arc  still  incor¬ 
rect  for  very  small  x. 


8.2.3  Results 


To  compare  the  two  code  construction  methods,  we  let  W  be  AWGNfcr2  =  0.1581),  N  =  210,  and 
used  both  methods  to  compute  BER(IU^)  for  1  <  %  <  N.  The  function  wuLiSun  ran  in  8  seconds, 
compared  to  225  seconds  for  AlgorithmD.java.  We  also  ran  simulations  of  a  genie-aided  decoder  in  this 
channel  for  about  eight  hours,  a  total  of  479453  trials,  to  produce  empirical  estimates  of  BER(ll/^r).  We 
then  tested  if  these  were  consistent  with  the  estimates  computed  by  the  code  construction  methods.  For 
894  of  the  1024  indices,  the  number  of  observed  errors  was  <  3,  and  for  all  894  of  these,  the  computed 
BERs  were  reasonable.  For  the  remaining  130  indices,  we  tested  the  hypothesis  that  the  observed  number 
of  errors  is  a  sample  from  the  binomial  distribution  with  n  =  479453  and  p  equal  to  the  computed  BER. 
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We  approximated  the  binomial  distribution  with  a  normal  distribution  with  mean  np  and  variance  np(  1  — 
p ).  The  resulting  Z-scores  are  in  Figure  14. 


20  40  60  80  100  120 

Index 


Gaussian  approximation  method 


Figure  14.  Z  scores. 


The  Tal/Vardy  BER  estimates  appear  very  reasonable,  as  64%  are  within  one  standard  error  of  the  em¬ 
pirical  estimates,  and  96%  are  within  two  standard  errors,  which  is  roughly  what  we  expect.  Only  the  one  , 
almost  four  standard  errors  above  the  mean,  indicates  a  small  inaccuracy  in  the  method.  In  contrast,  only 
47%  of  the  Gaussian  approximation  estimates  are  within  three  standard  errors,  and  75%  are  within  ten 
standard  errors.  For  seven  of  the  indices  (not  shown),  the  empirical  BER  was  more  than  30  standard  errors 
above  the  estimate. 

On  the  other  hand,  if  we  are  constructing  a  code  with  a  predetermined  rate,  then  this  inaccuracy  does 
not  have  a  large  effect.  In  Figure  15,  the  green  line  is  the  BLER  bound  computed  with  the  Tal/Vardy  method. 
The  blue  line  is  the  BLER  bound  obtained  by  choosing  A  by  the  Gaussian  approximation  method,  and 
then  using  the  BER  estimates  from  the  Tal/Vardy  method. 

The  difference  is  never  more  than  10%  except  for  lower  values  of  K  where  the  BLER  bound  is  less 
than  10~6.  The  Gaussian  approximation  method  may  be  useful  if  a  code  must  be  constructed  quickly. 
However,  for  larger  N  or  larger  a2,  very  small  LLRs  will  arise  more  often,  so  the  inaccuracy  of  this  method 
could  have  a  larger  effect.  It  may  be  necessary  to  compute  u(x)  more  accurately  for  very  small  values  of 
x. 
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Figure  15.  Comparison  of  code  construction  methods. 
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9.  OTHER  VERSIONS  OF  POLAR  CODING 


9.1  ENCODING  WITH  F®N 

Recall  that  we  defined  the  function  Gn  as  li\-  F®n.  Section  VII.C  of  [1]  says  “In  an  actual  implemen¬ 
tation  of  polar  codes,  it  may  be  preferable  to  use  F®n  in  place  of  /i/y  F®”  as  the  encoder  mapping  in  order 
to  simplify  the  implementation.” 

Figure  16  shows  two  different  ways  to  recursively  compute  F®n.  The  base  case  is  F®1  =  F  =  G 2 ■ 
Figure  16(a)  is  a  generalization  of  Figure  9  in  [1]. 


(a)  (b) 


Figure  16.  Two  recursive  constructions  of  F®n. 

These  encoders  arc  also  built  from  interconnected  copies  of  G2.  For  example,  in  Figure  16(a),  u  1  and 
Un/2+i  arc  the  inputs  to  a  Gy, 112  and  uN/ 2+2  are  the  inputs  to  another  GV  etc.  The  total  number  of  GVs 
in  either  encoder  is  nN/ 2. 

F_N.m  is  a  MATLAB®  implementation  of  F®n.  It  uses  the  structure  in  Figure  16(a). 

The  corresponding  decoder  is  built  using  the  same  strategy  described  in  Section  6.2:  it  has  nN/ 2  de¬ 
coding  elements  that  each  decode  one  copy  of  G2,  and  the  connections  between  these  decoding  elements 
parallel  the  connections  between  the  GVs  in  the  encoder.  We  can  use  a  decoder  based  on  Figure  16(a)  or 
(b)  regardless  of  which  structure  the  encoder  follows,  because  both  encoder  structures  produce  the  same 
output.  For  the  same  reasons  given  in  Section  6.3,  the  structure  in  Figure  16(b)  is  better  for  implementing 
the  decoder  with  a  recursive  function. 


9.2  SYSTEMATIC  POLAR  CODING 

E.  Arikan  introduced  systematic  polar  coding  in  [23].  “Systematic”  means  that  the  output  is  embedded 
in  the  input.  As  before,  the  encoder  uses  the  function  Gat  or  F®n,  and  we  divide  the  set  {1, . . . ,  N}  into 
two  sets  A  of  size  K  and  Ac  of  size  Nk-  As  before,  the  inputs  of  Gat  (or  F®n)  are  called  u  \  through  un, 
the  outputs  are  called  x\  through  x\.y,  and  the  u(A()  are  frozen.  The  difference  is  that  the  encoder’s  input 
goes  into  a  preprocessor,  which  solves  for  the  unique  u(A)  that  causes  x(A)  to  equal  the  encoder  input. 
Figure  17  shows  an  (8, 4)  systematic  encoder  with  A  =  {4, 6,  7, 8}. 
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=  to  1st  input 

=  to  2nd  input 
=  to  3rd  input 
=  to  4th  input 


Figure  17.  Systematic  polar  encoder. 


The  systematic  polar  decoder  is  exactly  the  same  as  the  non-systematic  decoder,  except  that  at  the  end 
it  returns  x(A)  instead  of  u(A). 

Reference  [23]  presents  simulation  results  showing  that  the  systematic  decoder  has  the  same  BLER 
and  a  lower  BER  than  the  non-systematic  decoder.  The  lower  BER  is  surprising,  because  both  decoders 
decide  u  the  same  way.  If  errors  are  made  in  deciding  u,  these  errors  propagate  from  left  to  right  through 
the  decoder,  so  one  might  expect  to  get  even  more  errors  at  the  right  where  x  is  decided.  Instead,  we  find 
that  errors  cancel  out,  so  x  has  fewer  errors.  This  is  an  empirical  result;  as  far  as  we  know,  it  has  never 
been  proved  or  even  explained. 

The  file  systematicPolarEncode.m  is  a  MATLAB®  implementation  of  a  systematic  polar  encoder  us¬ 
ing  F®n  as  shown  in  Figure  16(a).  The  file  polarDecode.m  is  a  MATLAB®  implementation  of  a  system¬ 
atic  polar  decoder  using  F®n,  based  on  an  algorithm  described  in  reference  [24].  We  created  Figure  16(b) 
to  match  the  structure  of  this  algorithm. 
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10.  SIMULATIONS 


10.1  MEASURES  OF  PERFORMANCE 

Recall  that  in  the  Gaussian  channel  described  in  Section  3,  the  BPSK  modulator  outputs  ±1,  and  the 
channel  output  is  ±1  +  n  where  n  is  normally  distributed  with  mean  0  and  variance  cr2.  The  quantity  ^4 
is  often  called  the  bit  energy  to  noise  power  spectral  density  ratio  {E^/Nq),  and  is  usually  expressed  in 
decibels. 

However,  when  FEC  coding  is  used,  Eb  is  redefined  as  the  energy  received  per  information  bit,  as  op¬ 
posed  to  the  symbol  energy  (. Es ),  which  is  the  amount  of  energy  received  in  one  use  of  the  channel.  Us¬ 
ing  an  (N,  K )  code,  K  bits  of  information  arc  sent  in  N  uses  of  the  channel,  so  KEb  =  NES.  We  have 
Es/Nq  =  2 4',  and  Eb/N o  =  2  n2k-  Es/Nq  is  a  property  of  the  channel,  but  Eb/No  depends  on  the  code 
used  with  that  channel. 

In  the  polar  codes  literature,  most  reported  simulation  results  arc  for  Gaussian  channels,  and  these  re¬ 
sults  arc  usually  displayed  with  a  graph  called  a  waterfall  plot.  In  these  graphs,  the  vertical  axis  usually 
shows  BER  or  BLER  on  a  logarithmic  scale,  and  the  horizontal  axis  usually  shows  Eb/No  in  decibels. 
Figure  1  of  [23]  is  a  good  example.  This  graph  compares  the  performance  of  a  systematic  polar  code  with 
the  corresponding  non-systematic  code,  so  the  graph  has  one  line  for  each  code.  Each  of  these  codes  was 
tested  in  13  different  Gaussian  channels,  resulting  in  13  points  on  each  line. 


10.2  ESTIMATING  BER 

We  wrote  the  MATLAB®  function  codeTest2  for  simulating  polar  codes  in  Gaussian  channels  and 
estimating  their  BERs.  It  repeatedly  performs  the  following  steps: 

1.  Generate  K  random  bits. 

2.  Input  those  bits  to  the  encoder,  which  outputs  N  bits. 

3.  Map  these  N  bits  to  ±1  and  add  normally  distributed  noise. 

4.  Compute  the  LLRs  of  the  results. 

5.  Input  these  N  LLRs  to  the  decoder,  which  outputs  K  bits. 

6.  Compare  the  output  bits  to  the  original  random  bits,  and  count  errors. 

One  difficulty  is  deciding  when  to  stop.  A  common  rule  of  thumb  in  digital  communications  is  to  stop  a 
test  when  100  bit  errors  have  occurred  ([25]).  However,  this  rule  is  based  on  the  assumption  that  each  bit  is 
incorrect  with  probability  p,  independent  of  all  other  bits.  This  assumption  is  incorrect  when  FEC  coding 
is  used.  We  might  decode  several  blocks  with  no  errors,  and  then  get  more  than  100  bit  errors  in  a  single 
block.  This  tells  us  little  about  how  frequently  block  errors  will  occur,  and  how  many  bit  errors  will  be 
included  in  each  block  error.  Figure  1 8  is  a  histogram  showing  various  numbers  of  bit  errors  found  in  one 
block  error. 

Suppose  we  decode  n  blocks  and  get  m  block  errors.  We  keep  track  of  the  number  of  bit  errors  in  each 
of  these  rn  blocks:  x\,  X2,  ■  ■  ■ ,  xm.  Let  x  be  the  mean  of  {x\,X2,  •  •  • ,  xm}  and  let  s  be  the  standard  devi¬ 
ation.  Then  the  observed  BER  is  —  4=4  —  is  the  observed  BLER  and  we  will  call  T  the  observed  bit  error 

n  K  n  K 
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Figure  18.  177  block  errors  in  a  (1024,  512)  polar  code  with  Efa/A/Q  =  2.5  decibels. 

rate  of  erroneous  blocks  (BEREB).  We  can  estimate  BLER  and  BEREB  separately.  Since  we  plot  BER  on 
a  logarithmic  scale,  our  metric  is  the  relative  error 

(true  BER)  —  (observed  BER) 

(observed  BER) 

Our  goal  is  that  the  standard  error  of  our  estimates  should  be  about  10%.  We  can  make  an  exception  for 
very  low  BERs.  A  commonly  used  target  BER  is  10  5  (e.g.,  in  Table  XVI  of  [26]),  so  if  we  are  confident 
that  a  BER  is  below  10-5  we  are  not  concerned  with  the  relative  error. 

The  relative  error  of  BER  is  the  product  of  the  relative  errors  of  BLER  and  BEREB.  For  small  errors 
these  are  roughly  additive,  e.g.,  if  we  overestimate  BLER  by  3%  and  BEREB  by  5%,  then  we  overestimate 
BER  by  about  8%.  So  if  the  standard  errors  of  our  estimates  of  BLER  and  BEREB  are  a%  and  b%,  respec¬ 
tively,  then  the  standard  error  of  our  BER  estimate  is  about  \J af  +  b2%. 

If  the  true  BLER  is  p,  then  m  will  have  a  binomial  distribution  with  mean  tip  and  standard  deviation 
y/np(  1  —  p).  For  small  p,  this  is  roughly  y/np.  Therefore,  if  we  have  observed  m  block  errors,  the  stan¬ 
dard  error  of  BLER  is  about  s/m,  and  relative  standard  error  is  about  1  / yfm. 

If  m  >  30,  then  x / K  will  have  an  approximately  normal  distribution  with  standard  deviation  s / y/m, 
so  the  standard  error  of  our  BEREB  estimate  is 

sK 

Xy/m 

For  1  <  m  <  30,  this  formula  is  less  accurate  and  tends  to  underestimate  the  standard  error.  The  function 
codeTest2  roughly  compensates  by  using  y/m  —  1  in  the  denominator  in  place  of  y[m\ 


Xy/m  —  1 

The  function  codeTest2  keeps  track  of  the  elapsed  time  as  it  repeatedly  encodes  and  decodes.  Ev¬ 
ery  five  minutes,  it  approximately  computes  the  relative  standard  error  of  the  BER  estimate.  It  stops  if  the 
relative  standard  error  is  <  0.1.  It  also  stops  if  the  BER  estimate  is  at  least  two  standard  errors  below  10-5, 
indicating  that  the  upper  limit  of  a  95%  confidence  interval  is  <  10-5. 

Equation  (5)  cannot  be  used  when  m  =  0.  In  this  case,  we  estimate  the  BLER  to  be  (0.95) 1//n  be¬ 
cause  this  is  the  upper  limit  with  95%  confidence.  The  function  codeTest2  conservatively  estimates  the 


36 


BEREB  to  be  1/2, 9  and  stops  if  the  product  of  these  two  estimates  is  <  10-5.  None  of  our  tests  stopped 
with  zero  block  errors.  A  more  complicated  formula  should  be  used  when  m  =  1,  but  we  did  not  use  such 
a  formula;  we  used  equation  5  for  all  m  >  0.  When  m  =  1  this  formula  evaluates  to  NaN  (not  a  number) 
so  it  was  not  possible  for  a  test  to  stop  with  m  =  1. 

The  function  codeTest2  also  saves  the  results  to  a  MAT-file  every  5  min.  This  is  useful  because  if 
the  execution  is  interrupted,  at  most,  5  min  of  data  is  lost.  The  function  codeTest2  can  also  reload  the 
MAT-file  and  continue  the  test  from  the  point  it  was  last  saved. 


10.3  RESULTS  OF  CODETEST2 

The  blue  lines  in  Figure  19  are  waterfall  plots  for  polar  codes  of  rate  1/2.  The  codes  were  constructed 
using  AlgorithmD.java,  and  the  BERs  were  estimated  using  codeTest2.10  A  blue  line  that  meets  the 
bottom  of  the  graph  indicates  that  a  BER  estimate  less  than  10~5  was  computed. 


We  mentioned  previously  that  each  line  on  a  waterfall  plot  shows  the  results  of  the  same  code  at  dif¬ 
ferent  Eb/No  levels.  However,  in  the  blue  lines  of  Figure  19,  each  point  on  the  line  represents  a  code  con¬ 
structed  for  that  particular  Eb/N^.  This  is  a  common  practice  in  the  polar  codes  literature  because  polar 
codes  are  easily  customized  for  different  Eb/No  levels. 


10.4  QUATERNARY  PHASE  SHIFT  KEYING  SIMULATIONS 

We  now  give  another  example  of  a  channel: 

Example  3:  Gaussian  Channel  with  Quaternary  Phase  Shift  Keying  (QPSK).  This  channel  repre¬ 
sents  QPSK  modulation  and  a  line-of-sight  RF  channel  with  negligible  multipath.  The  QPSK  modulator 

9  Getting  a  better  estimate  of  BEREB  is  possible  by  extrapolating  from  other  tests  that  produced  more  block  errors.  The  func¬ 
tion  codeTest2  can  use  a  user-supplied  BEREB  bound  instead  of  1/2.  We  did  not  use  this  feature. 

10The  green  line  is  a  waterfall  plot  for  a  convolutional  code,  which  is  explained  in  Section  10.4.  It  was  computed  using  run- 
QPSK.m. 
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first  maps  a  pair  of  bits  to  a  point  in  the  in-phase/quadrature  (I/Q)  plane.  There  are  several  QPSK  map¬ 
pings.  We  use  (0,  0)  ^  (0, 1)  ^  ^),  (1,0)  ^  ( ^,  (1, 1)  ^  (^,  ^).  The 

I/Q  point  is  then  converted  to  a  sinusoidal  waveform  that  is  transmitted.  The  receiver  rescales  the  wave¬ 


form  to  compensate  for  the  attenuation.  The  demodulator  maps  the  waveform  back  to  I/Q  space,  resulting 
in  +  m,  +  no),  where  rij  and  tiq  arc  random  noise  variables  that  arc  assumed  to  be  indepen¬ 

dent,  normally  distributed,  with  0  mean  and  equal  variance. 


For  any  a  >  0,  the  Gaussian  QPSK  channel  with  noise  variance  a1  has  input  set  X  =  {0,  l}2,  output 
set  y  =  M2,  and  transition  probabilities 


W(yi,y2\u1,u2) 


/ 


Note  that  this  is  not  a  binary  channel  because  it  has  more  than  two  inputs.  It  is  memoryless,  and  sym¬ 
metric  in  an  appropriate  sense. 

Like  the  BPSK  Gaussian  channel,  the  QPSK  Gaussian  channel  has  the  same  Es/Nq  =  -^X.  However, 
twice  as  much  information  is  sent  in  each  use  of  the  channel,  so  E^/Nq  =  4  ™K. 

Section  4.8.4  of  [27]  shows  that  each  use  of  the  QPSK  Gaussian  channel  is  equivalent  to  two  consecu¬ 
tive  uses  of  the  BPSK  Gaussian  channel  with  the  same  E^/Nq.  So  codeTest2  can  simulate  QPSK  if  we 
interpret  two  consecutive  bits  as  the  I  and  Q  components  of  the  same  signal.  We  also  wrote  runQPSK.m, 
a  MATLAB®  script  that  explicitly  simulates  QPSK.  This  script  can  simulate  polar  coding,  and  it  can 
also  simulate  convolutional  coding,  using  functions  from  the  MATLAB®  Communications  Toolbox.  An 
(TV,  K.  I )  convolutional  code  is  an  FEC  code  that  takes  K  bits  as  input  and  outputs  TV  bits.  However,  these 
TV  bits  are  determined  not  only  by  the  input  but  also  by  the  previous  l  —  1  inputs.  Convolutional  codes  are 
not  block  codes. 

Figure  20  shows  BER  estimates  computed  with  runQPSK.m.  These  include  polar  codes  of  three  dif¬ 
ferent  lengths.  They  also  include  two  convolutional  codes,  a  (2, 1,  7)  and  a  (2, 1,  9).  The  former  is  used  in 
several  U.S.  military  communications  systems  (for  example,  see  [26]). 


10.5  SIMULINK®  RESULTS 

We  built  Simulink®  models  representing  QPSK  Gaussian  channels.  Figure  21  shows  results  for  three 
different  FEC  codes:  a  (1024,  512)  turbo  code,  a  (64800,  32400)  LDPC  code,  and  a  (61,  51)  Reed-Solomon 
code. 
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Polar  Code,  K=64,  N=128 
Polar  Code,  K=256,  N=512 
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CC,  ConLen=9,  G=[56 1,753] 
Uncoded  Theory 
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Figure  20.  BER  estimates  of  rate  1/2  codes  in  QPSK  Gaussian  channels. 


Figure  21.  BER  estimates  in  QPSK  Gaussian  channels. 
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11.  CONCLUSION 


Our  FY 14  progress  has  consisted  mostly  of  implementing  published  algorithms  and  reproducing  pub¬ 
lished  results.  We  have  implemented  and  tested  several  methods  for  encoding,  decoding,  and  code  con¬ 
struction. 

Our  preliminary  conclusion  is  that  polar  codes  can  outperform  the  FEC  currently  used  in  some  modes 
of  Navy  communications  systems.  Polar  codes  may  also  be  able  to  outperform  other  codes  that  could  be 
used  to  replace  the  current  codes.  Polar  codes  are  more  likely  to  be  useful  at  higher  data  rates.  Further  re¬ 
search  is  needed  to  test  polar  codes,  using  more  realistic  models  of  Navy  systems  and  their  RF  environ¬ 
ments.  We  are  now  well  positioned  to  do  this  further  research. 
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code,  a  Reed-Solomon  code,  and  three  convolutional  codes. 

The  team’s  preliminary  conclusion  is  that  polar  codes  can  outperform  the  FEC  currently  used  in  some  modes  of  Navy 
communications  systems.  Polar  codes  may  also  be  able  to  outperform  other  codes  that  could  be  used  to  replace  the  current  codes. 
Polar  codes  are  more  likely  to  be  useful  at  higher  data  rates.  Further  research  is  needed  to  test  polar  codes,  using  more  realistic 
models  of  Navy  systems  and  their  RF  environments.  Space  and  Naval  Warfare  Systems  Center  Pacific  is  now  well  positioned  to 
do  this  further  research. 

15.  SUBJECT  TERMS 

Mission  Area:  Communications 

polar  codes  polar  encoder  polar  construction  Gaussian  approximation  method  polar  coding  algorithms 

block  codes  polar  decoder  Tal/Vardy  Method  forward  error  correction 
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